Smart Paper Notes

NeeDrop: Self-supervised Shape Representation from Sparse Point Clouds using Needle Dropping

Alexandre Boulch, Pierre-Alain Langlois, Gilles Puy, Renaud Marlet

Categories: Computer Vision | Machine Learning

2021-11-30

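There has recently been growing interest in implicit shape representations. Unlike explicit representations, they have no resolution limitations and easily handle a wide variety of surface topologies. To learn these implicit representations, current approaches rely on a certain level of shape supervision (e.g., inside/outside information or distance-to-shape knowledge), or at least require a dense point cloud (to approximate the distance to the shape well enough). In contrast, we introduce NeeDrop, a self-supervised method for learning shape representations from possibly extremely sparse point clouds. As in Buffon's needle problem, we "drop" (sample) needles on the point cloud and consider that, statistically, close to the surface, the needle end points lie on opposite sides of the surface. No shape knowledge is required, and the point cloud can be highly sparse, e.g., a lidar point cloud acquired by a vehicle. Previous self-supervised shape representation approaches fail to produce good-quality results on this kind of data. We obtain quantitative results on par with existing supervised approaches on shape reconstruction datasets and show promising qualitative results on hard autonomous driving datasets such as KITTI.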

A Practical Introduction to Side-Channel Extraction of Deep Neural Network Parameters

Raphael Joud, Pierre-Alain Moellic, Simon Pontie, Jean-Baptiste Rigaud

Categories: Machine Learning

2022-11-10

Model extraction is a major threat for embedded deep neural network models that leverages an extended attack surface. Indeed, by physically accessing a device, an adversary may exploit side-channel leakages to extract critical information about a model (i.e., its architecture or internal parameters). Different adversarial objectives are possible, including a fidelity-based scenario where the architecture and parameters are precisely extracted (model cloning). This work focuses on software implementations of deep neural networks embedded in a high-end 32-bit microcontroller (Cortex-M7) and exposes several challenges related to fidelity-based parameter extraction through side-channel analysis, from the basic multiplication operation to the feed-forward connection through the layers. To precisely extract the value of parameters represented in the single-precision floating-point IEEE-754 standard, we propose an iterative process that is evaluated with both simulations and traces from a Cortex-M7 target. To our knowledge, this work is the first to target such a high-end 32-bit platform. Importantly, we raise and discuss the remaining challenges for the complete extraction of a deep neural network model, particularly the critical case of biases.
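
As a reminder of what a single-precision IEEE-754 parameter looks like in memory, the sketch below splits a value into its sign, exponent and mantissa fields and rebuilds it. It is not the extraction process from the paper; the function names `decompose_float32` and `recompose_float32` are hypothetical and only illustrate the encoding that such an attack has to recover.

```python
import struct


def decompose_float32(value: float) -> tuple[int, int, int]:
    """Return the (sign, biased exponent, mantissa) fields of value encoded as float32."""
    # Pack as a big-endian float32, then reinterpret the same 4 bytes as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF       # 23 bits (implicit leading 1 for normal numbers)
    return sign, exponent, mantissa


def recompose_float32(sign: int, exponent: int, mantissa: int) -> float:
    """Rebuild the float32 value from its three fields."""
    bits = (sign << 31) | (exponent << 23) | mantissa
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value


if __name__ == "__main__":
    # Hypothetical weight value chosen because it is exactly representable in float32.
    fields = decompose_float32(-0.15625)
    print(fields, recompose_float32(*fields))
```

Because the 23 mantissa bits carry most of the precision, recovering only the sign and exponent would give a coarse estimate of a weight, while an exact clone needs all 32 bits.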

A Closer Look at Evaluating the Bit-Flip Attack Against Deep Neural Networks

Kevin Hector, Mathieu Dumont, Pierre-Alain Moellic, Jean-Max Dutertre

Categories: Machine Learning

2022-09-28

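Deep neural network models are massively deployed on a wide variety of hardware platforms. This results in the appearance of new attack vectors that significantly extend the standard attack surface, which has been extensively studied by the adversarial machine learning community. One of the first attacks that aims to drastically degrade the performance of a model by targeting its parameters (weights) stored in memory is the Bit-Flip Attack (BFA). In this work, we point out several evaluation challenges related to the BFA. First, the lack of an adversary's budget in the standard threat model is problematic, especially when dealing with physical attacks. Moreover, since the BFA exhibits critical variability, we discuss the influence of some training parameters and the importance of the model architecture. This work is the first to present the impact of the BFA on fully-connected architectures, which behave differently from convolutional neural networks. These results highlight the importance of defining robust and sound evaluation methodologies to properly evaluate the dangers of parameter-based attacks and to measure the real level of robustness offered by a defense.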
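
To illustrate why flipping a single stored bit can matter so much, here is a minimal sketch of a bit flip on one weight. The abstract does not state the weight format, so the signed 8-bit (two's complement) encoding and the helper name `flip_bit_int8` are assumptions; the sketch does not implement the search for which bits to flip.

```python
def flip_bit_int8(weight: int, bit: int) -> int:
    """Flip one bit of a signed 8-bit (two's complement) weight and return the new value."""
    if not -128 <= weight <= 127:
        raise ValueError("weight must fit in a signed 8-bit integer")
    if not 0 <= bit <= 7:
        raise ValueError("bit index must be between 0 and 7")
    raw = weight & 0xFF   # view the weight as an unsigned byte
    raw ^= 1 << bit       # toggle the chosen bit
    # Convert the byte back to a signed value.
    return raw - 256 if raw >= 128 else raw


if __name__ == "__main__":
    # Hypothetical small weight: compare the effect of the lowest and highest bit.
    for bit in (0, 7):
        print(bit, flip_bit_int8(1, bit))
```

Under this assumed encoding, flipping the most significant bit moves a small weight to the opposite end of the representable range, which is why the number of flips an adversary can afford (the budget discussed above) is a meaningful part of the threat model.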

Building Open Knowledge Graph for Metal-Organic Frameworks (MOF-KG): Challenges and Case Studies

Yuan An, Jane Greenberg, Xintong Zhao, Xiaohua Hu, Scott McCLellan, Alex Kalinowski, Fernando J. Uribe-Romo, Kyle Langlois, Jacob Furst, Diego A. Gómez-Gualdrón

Categories: Artificial Intelligence

2022-07-10

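Metal-Organic Frameworks (MOFs) are a class of modular, porous crystalline materials with great potential to revolutionize applications such as gas storage, molecular separation, chemical sensing, catalysis, and drug delivery. The Cambridge Structural Database (CSD) reports 10,636 synthesized MOF crystals and additionally contains ca. 114,373 MOF-like structures. The sheer number of synthesized (plus potentially synthesizable) MOF structures requires researchers to pursue computational techniques to screen and isolate MOF candidates. In this demo paper, we describe our efforts to leverage knowledge graph methods to facilitate MOF prediction, discovery, and synthesis. We present challenges and case studies on (1) constructing a MOF knowledge graph (MOF-KG) from structured and unstructured sources and (2) leveraging the MOF-KG to discover new or missing knowledge.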

Classifying Emails into Human vs Machine Category

Changsung Kang, Hongwei Shang, Jean-Marc Langlois

Categories: Natural Language Processing | Machine Learning

2021-12-14

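Distinguishing between personal and machine-generated emails is an essential product requirement for Yahoo Mail. The old production classifier in Yahoo Mail was based on a simple logistic regression model, trained on features aggregated at the SMTP address level. We propose building deep learning models at the message level. We built and trained four individual CNN models: (1) a content model with the subject and content as input; (2) a sender model with the sender's email address and name as input; (3) an action model that analyzes email recipients' action patterns and correspondingly generates target labels based on the open/delete behavior observed for each sender; (4) a salutation model that uses the sender's "explicit salutation" signal as positive labels. Next, after exploring different combinations of these four models, we built the final full model. Compared with the old production model, our full model improves adjusted recall from 70.5% to 78.8% while raising precision from 94.7% to 96.0%. Our full model also significantly outperforms the state-of-the-art BERT model on this task. The full model has been deployed in the current production system (Yahoo Mail 6).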

Efficient and robust high-dimensional sparse logistic regression via nonlinear primal-dual hybrid gradient algorithms

Jérôme Darbon, Gabriel P. Langlois

Categories: Machine Learning

2021-11-30

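Logistic regression is a widely used statistical model for describing the relationship between a binary response variable and predictor variables in data sets. It is often used in machine learning to identify important predictor variables. This task, variable selection, typically amounts to fitting a logistic regression model regularized by a convex combination of $\ell_1$ and $\ell_{2}^{2}$ penalties. Since modern big data sets can contain up to billions of predictor variables, variable selection methods depend on efficient and robust optimization algorithms to perform well. State-of-the-art algorithms for variable selection, however, were not traditionally designed to handle big data sets; they either scale poorly or are prone to producing unreliable numerical results. It therefore remains challenging to perform variable selection on big data sets without access to adequate and costly computational resources. In this paper, we propose a nonlinear primal-dual algorithm that addresses these shortcomings. Specifically, we propose an iterative algorithm that computes a solution to this regularized logistic regression problem in $O(T(m,n)\log(1/\epsilon))$ operations, where $\epsilon \in (0,1)$ denotes the tolerance and $T(m,n)$ denotes the number of arithmetic operations required to perform a matrix-vector multiplication on a data set with $m$ samples, each comprising $n$ features. This result improves on the known complexity bound of $O(\min(m^2n,mn^2)\log(1/\epsilon))$ for first-order optimization methods such as the classic primal-dual hybrid gradient or forward-backward splitting methods.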
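
For readers who want the optimization problem spelled out, one common way to write a logistic regression objective regularized by a convex combination of $\ell_1$ and $\ell_{2}^{2}$ penalties is given below. The symbols $\theta$, $x_i$, $y_i$, $\lambda$ and $\alpha$ are assumptions for illustration and may differ from the paper's notation and scaling conventions; only $m$ and $n$ come from the abstract.

$$
\min_{\theta \in \mathbb{R}^{n}} \; \frac{1}{m} \sum_{i=1}^{m} \log\left(1 + \exp\left(-y_i \langle x_i, \theta \rangle\right)\right) + \lambda \left( \alpha \|\theta\|_1 + (1 - \alpha) \|\theta\|_2^2 \right)
$$

Here $x_i \in \mathbb{R}^{n}$ is the $i$-th sample, $y_i \in \{-1, +1\}$ its binary label, $\lambda > 0$ the regularization strength, and $\alpha \in [0, 1]$ the mixing weight. Setting $\alpha = 1$ leaves only the $\ell_1$ term, which is the part that drives coefficients to exactly zero and thereby performs variable selection.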

Deep learning for time series classification: a review

Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, Pierre-Alain Muller

Categories:

2018-09-12

Time Series Classification (TSC) is an important and challenging problem in data mining. With the increase of time series data availability, hundreds of TSC algorithms have been proposed. Among these methods, only a few have considered Deep Neural Networks (DNNs) to perform this task. This is surprising, as deep learning has seen very successful applications in recent years. DNNs have indeed revolutionized the field of computer vision, especially with the advent of novel deeper architectures such as Residual and Convolutional Neural Networks. Apart from images, sequential data such as text and audio can also be processed with DNNs to reach state-of-the-art performance for document classification and speech recognition. In this article, we study the current state-of-the-art performance of deep learning algorithms for TSC by presenting an empirical study of the most recent DNN architectures for TSC. We give an overview of the most successful deep learning applications in various time series domains under a unified taxonomy of DNNs for TSC. We also provide an open source deep learning framework to the TSC community where we implemented each of the compared approaches and evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By training 8,730 deep learning models on 97 time series datasets, we propose the most exhaustive study of DNNs for TSC to date.
